Scalable model-based cluster analysis using clustering features



Related articles

Scalable model-based cluster analysis using clustering features

We present two scalable model-based clustering systems based on a Gaussian mixture model with independent attributes within clusters. They first summarize data into sub-clusters, and then generate Gaussian mixtures from their clustering features using a new algorithm — EMACF. EMACF approximates the aggregate behavior of each sub-cluster of data items in the Gaussian mixture model. It provably c...


Scalable, Balanced Model-based Clustering

This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. Partitional, model-based clustering algorithms are viewed as an iterative two-step optimization process: iterative model re-estimation and sample re-assignment. Instead of a maximum-likelihood (ML) assignment, a balance-constrain...


On Model-Based Clustering, Classification, and Discriminant Analysis

The use of mixture models for clustering and classification has burgeoned into an important subfield of multivariate analysis. These approaches have been around for a half-century or so, with significant activity in the area over the past decade. The primary focus of this paper is to review work in model-based clustering, classification, and discriminant analysis, with particular attenti...


Scalable Clustering using MapReduce Programming Model

The aim is to implement a clustering algorithm that runs in a distributed computing environment, for which a multi-node Hadoop cluster supporting the Hadoop Distributed File System and the MapReduce programming model has been set up. In this paper, Exclusive and Complete Clustering (ExCC), a grid-based algorithm, is implemented by scheduling consecutive MapReduce jobs, for mass...


MutantX-S: Scalable Malware Clustering Based on Static Features

The current lack of automatic and speedy labeling of a large number (thousands) of malware samples seen every day delays the distribution of malware signatures, leading to a low detection rate of new malware samples in the wild. In this paper, we design, implement and evaluate a novel, scalable framework, called MutantX-S, that can efficiently cluster a large number of samples into families base...



Journal

Journal title: Pattern Recognition

Year: 2005

ISSN: 0031-3203

DOI: 10.1016/j.patcog.2004.07.012